Add RAG Documentation by SashenkaG · Pull Request #161 · wso2/docs-devant

SashenkaG · 2026-02-18T09:48:15Z

Description

This PR introduces several improvements and updates to the RAG (Retrieval-Augmented Generation) documentation, including:

Documentation Restructuring and Navigation Updates:

Split the original rag-application.md into four focused guides: rag-ingestion.md, rag-retrieval.md, schedule-rag-automation.md, and rag-service.md, each covering a specific aspect of RAG workflows.
Updated the sidebar navigation in mkdocs.yml to group RAG documentation under a new "RAG" section, with clear links to Ingestion, Retrieval, Schedule Automation, and Service guides.
Changed the homepage AI Agents section to link to the new "RAG Ingestion" guide instead of the old "RAG Application" guide.

RAG Ingestion Guide Improvements:

Expanded the ingestion guide with clearer instructions, improved UI navigation directions, and updated GIFs and screenshots.

New Guides and Features:

Added a new "RAG Retrieval" guide with instructions on retrieving ingested data, including configuration, querying, and reranking details.
Added a new "Schedule RAG Automation" guide describing how to set up scheduled ingestion from external datasources (Google Drive, S3).
Added a new "RAG Service" guide detailing the available REST API endpoints for programmatic ingestion, retrieval, chunking, and health checks, with example requests and responses.

Resolves issues #2201 and #2060

Summary by CodeRabbit

Documentation
- Restructured RAG docs with standardized "RAG ingestion" wording, reworded/renumbered steps, and updated console navigation instructions.
- Added a "Choose ingestion mode" step (Upload Now or Schedule) and expanded supported sources (PDFs, scanned PDFs, Office, HTML/MD, images, audio, CSV).
- Published separate guides for RAG retrieval, RAG service endpoints, and scheduled RAG automation (Google Drive example).
- Updated site navigation, AI Quick Start link, and added new GIFs and video note.

…on documentation

coderabbitai · 2026-02-18T09:48:37Z

Walkthrough

Split the original RAG guide into a multi-page RAG section (Ingestion, Retrieval, Service, Schedule Automation), updated navigation and homepage links, revised ingestion flow (new "Choose ingestion mode"), added support details for multiple file types, new media, and created API/service documentation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| RAG docs (rework + additions) `en/docs/ai/rag-ingestion.md`, `en/docs/ai/rag-retrieval.md`, `en/docs/ai/rag-service.md`, `en/docs/ai/schedule-rag-automation.md` | Reworked ingestion doc (terminology, renamed steps, added "Choose ingestion mode", broadened supported file types, updated UI navigation and images/gifs, removed embedded retrieval section). Added new retrieval guide (Pinecone-backed flow, query params, reranking note), new service doc (endpoints: `/upload`, `/retrieve`, `/chunks`, `/health` with payloads), and scheduled automation guide (create automation, datasource config, scheduling, logs). |
| Navigation config `en/mkdocs.yml` | Replaced single "RAG Application" entry with a new "RAG" nav group containing Ingestion, Retrieval, Schedule Automation, and Service pages. |
| Quick Start / homepage `en/theme/material/templates/home-page.html` | Updated AI/Quick Start link label and anchor from "RAG Application" to "RAG Ingestion" and pointed the link to `rag-ingestion`. |
| Media referenced in docs `en/docs/ai/rag-ingestion.md` (images: `rag-ingestion1.gif`, `rag-ingestion2.gif`, `rag-ingestion3.gif`) | Added and repositioned animated images/gifs and a video tutorial reference; updated screenshots and UI copy to match new flow. |
| Text edits / copy `en/docs/ai/rag-ingestion.md`, `en/docs/ai/rag-retrieval.md` | Standardized terminology (lowercased "RAG ingestion", "Configure chunking", etc.), updated API/key references (Pinecone API key, OpenAI platform), and adjusted navigation instructions to use console header org selection and left navigation path. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client (UI/API)
    participant Service as RAG Service
    participant Embed as Embedding API
    participant Vector as Vector DB
    participant Store as Object Storage

    rect rgba(200,230,255,0.5)
    Client->>Service: POST /upload (file, metadata)
    Service->>Store: Persist original file (optional)
    Service->>Embed: Request embeddings for chunks
    Embed-->>Service: Return embedding vectors
    Service->>Vector: Upsert vectors into index
    Vector-->>Service: Ack upsert
    Service-->>Client: Upload response (status, ids)
    end

    rect rgba(200,255,200,0.5)
    Client->>Service: POST /retrieve (query, params)
    Service->>Embed: Embed query text
    Embed-->>Service: Return query vector
    Service->>Vector: Search (top N, threshold)
    Vector-->>Service: Retrieved chunks
    Service->>Service: (optional) Rerank / filter results
    Service-->>Client: Retrieved results (documents, metadata)
    end
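
For illustration, the two flows in the diagram can be sketched in memory. This is a minimal sketch, not the Devant RAG service: `embed` is a toy word-count stand-in for the Embedding API, `VectorIndex` is a hypothetical stand-in for the Vector DB, and every function name and the `top_n` and `threshold` parameters are assumptions. The optional object-storage and rerank steps are left out.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding API)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Hypothetical in-memory vector store keyed by chunk id."""

    def __init__(self) -> None:
        self._rows: dict[str, tuple[Counter, str]] = {}

    def upsert(self, chunk_id: str, vector: Counter, text: str) -> None:
        self._rows[chunk_id] = (vector, text)

    def search(self, query: Counter, top_n: int, threshold: float) -> list[tuple[float, str, str]]:
        # Score every stored chunk, drop those below the threshold, keep the best top_n.
        scored = [(cosine(query, vec), cid, text) for cid, (vec, text) in self._rows.items()]
        hits = [row for row in scored if row[0] >= threshold]
        return sorted(hits, reverse=True)[:top_n]


def upload(index: VectorIndex, doc_id: str, chunks: list[str]) -> list[str]:
    """Upload flow: embed each chunk, upsert it, and return the chunk ids."""
    ids = []
    for position, chunk in enumerate(chunks):
        chunk_id = f"{doc_id}-{position}"
        index.upsert(chunk_id, embed(chunk), chunk)
        ids.append(chunk_id)
    return ids


def retrieve(index: VectorIndex, query: str, top_n: int = 3, threshold: float = 0.1) -> list[dict]:
    """Retrieve flow: embed the query, search the index, return matching chunks."""
    hits = index.search(embed(query), top_n, threshold)
    return [{"id": cid, "score": round(score, 3), "text": text} for score, cid, text in hits]


index = VectorIndex()
upload(index, "guide", ["RAG ingestion splits files into chunks", "Billing is monthly"])
print(retrieve(index, "how does ingestion create chunks"))
```

Only the first chunk shares words with the query, so the second one falls below the threshold and is filtered out.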

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibbled the docs and split every page,
Ingest, retrieve, service — each earned its stage.
A scheduler hums and GIFs hop in time,
I hopped through the changes and left a small rhyme. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'Add RAG Documentation' is generic and does not convey the primary change: restructuring and expanding RAG documentation into four focused guides. | Consider a more specific title such as 'Restructure RAG documentation into four focused guides' or 'Split RAG documentation into Ingestion, Retrieval, Automation, and Service guides'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The PR description is well-structured and covers key changes, objectives, and links to related issues, but does not fully address the template sections (Purpose/Goals/Approach are implicit, but explicit sections for User stories, Release notes, Documentation, Training, Certification, Marketing, Testing, Security, Samples, and Test environment are missing or marked incomplete). |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (1)

en/docs/ai/schedule-rag-automation.md (1)

29-29: Amazon S3 is listed as a supported datasource but has no corresponding setup instructions.

Step 2 only covers Google Drive. Users who select S3 will have no guidance.

Either add S3 configuration steps, or soften the claim to reflect that only Google Drive is currently documented:

✏️ Proposed interim fix

-The datasource specifies the location from which files will be ingested. Devant supports both Google Drive folders and Amazon S3 buckets as datasources.
+The datasource specifies the location from which files will be ingested. Devant supports Google Drive folders and Amazon S3 buckets as datasources. The steps below walk through Google Drive; refer to the platform documentation for S3 configuration.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/schedule-rag-automation.md` at line 29, The docs claim that Devant
supports Google Drive folders and Amazon S3 buckets but Step 2 only documents
Google Drive; either add missing S3 setup instructions (how to create an S3
bucket, IAM policy/credentials, environment variables or config keys the app
expects, and any required path/key format) and link them from the "datasource"
paragraph, or change the "datasource specifies..." sentence to only mention
Google Drive (e.g., "currently supports Google Drive folders") and note S3 is
planned; update the "Step 2" heading to reflect the chosen change so S3 users
are not left without guidance.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@en/docs/ai/rag-ingestion.md`:
- Line 13: Fix the capitalization and align the supported file-type list with
rag-service.md: change "PDFS" to "PDFs" and expand the list in the sentence that
starts with "Devant RAG ingestion has support for multiple file types" to
include PDF (including scanned), DOCX, PPTX, XLSX, HTML, CSV, MD, images, and
audio formats MP3, WAV, OGG, M4A, FLAC so the documentation in rag-ingestion.md
matches rag-service.md.

In `@en/docs/ai/rag-retrieval.md`:
- Line 50: Change the lowercase phrase "rag retrieval" to capitalized "RAG
retrieval" in the sentence "After completing the RAG ingestion process, you can
also implement a rag retrieval to connect your vector database with user queries
and generate responses using the WSO2 Integrator: BI." so the term matches the
acronym formatting; update the instance in the file content where "rag
retrieval" appears.
- Around line 45-46: Replace the incorrect admonition token "!!!info" with the
correct Markdown admonition syntax "!!! info" in the document (look for the
string "!!!info" in the content snippet) so the block renders properly; update
any identical occurrences in this file (and optionally mirror the same fix in
rag-service.md) to maintain consistent admonition formatting.
- Around line 6-9: Fix the prose and formatting in the RAG retrieval doc: change
the note marker from "!!!note" to "!!! note", correct "in to" to "into", change
"havent" to "haven't", and remove the duplicated phrase so "navigate to your
organization using the **Organization** dropdown in the in the top left of the
Devant console header" becomes "navigate to your organization using the
**Organization** dropdown in the top left of the Devant console header"; update
the note block and the navigation sentence accordingly in the rag-retrieval.md
content.

In `@en/docs/ai/rag-service.md`:
- Line 153: Fix the typo in the sentence containing the link text "[RAG
Ingesion]" by changing "RAG Ingesion" to "RAG Ingestion" so the sentence reads
"For more details on RAG ingestion and retrieval, and how to obtain API keys and
credentials refer the [RAG Ingestion](rag-ingestion.md) guide." Ensure only the
displayed link text is corrected without altering the target URL
"rag-ingestion.md".
- Around line 32-33: The admonition uses incorrect syntax "!!!note" which won't
be recognized; update the string in en/docs/ai/rag-service.md to include the
required space so it reads "!!! note" (match the correct usage already present
elsewhere in the file) to ensure Python-Markdown's admonition extension renders
the note block properly.
- Line 26: Fix the spelling mistake in the documentation sentence that currently
reads "On the devolopment environment card…" by changing "devolopment" to
"development" so the sentence becomes "On the development environment card,
click **Test** to open the OpenAPI Console…"; update this text in the
en/docs/ai/rag-service.md content where that sentence appears.

In `@en/docs/ai/schedule-rag-automation.md`:
- Line 36: Update the sentence about creating an API key to clarify that
standard Google Cloud API keys cannot access private Drive folders and to
specify what credential format the product's "API Key" field accepts (e.g.,
service account JSON, OAuth 2.0 client/refresh token, or domain-wide delegated
service account) and then replace or augment the existing link with direct links
to the appropriate workflows for creating service account credentials and OAuth
2.0 credentials (Google Drive service account JSON setup and OAuth
consent/refresh token guides) so users follow the correct credential creation
path for private Drive access.

---

Nitpick comments:
In `@en/docs/ai/schedule-rag-automation.md`:
- Line 29: The docs claim that Devant supports Google Drive folders and Amazon
S3 buckets but Step 2 only documents Google Drive; either add missing S3 setup
instructions (how to create an S3 bucket, IAM policy/credentials, environment
variables or config keys the app expects, and any required path/key format) and
link them from the "datasource" paragraph, or change the "datasource
specifies..." sentence to only mention Google Drive (e.g., "currently supports
Google Drive folders") and note S3 is planned; update the "Step 2" heading to
reflect the chosen change so S3 users are not left without guidance.

en/docs/ai/rag-ingestion.md

en/docs/ai/rag-retrieval.md

en/docs/ai/rag-service.md

en/docs/ai/schedule-rag-automation.md

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

en/docs/ai/rag-service.md (1)
27-30: Try it out should be bolded to match the UI-element formatting convention used elsewhere.

Every other interactive UI label in this list (**Test**, **Execute**) is bolded; Try it out on Line 28 is the odd one out.
✏️ Proposed fix
-3. Click Try it out to enable it.
+3. Click **Try it out** to enable it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/rag-service.md` around lines 27 - 30, Make the UI label "Try it
out" bold to match the existing convention used for other interactive labels;
locate the step text containing the plain "Try it out" in the steps list (the
same block that contains "**Test**" and "**Execute**") and change it so the
label is wrapped in markdown bold markers (i.e., make it "**Try it out**") to
keep formatting consistent with the other UI elements.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@en/docs/ai/rag-retrieval.md`:
- Line 19: The sentence "Enter the **Collection Name**. The collection will be
automatically created if it does not exist." in step 3 is misleading for
retrieval-only flows; remove the second sentence and replace it with a short
note that the named collection must already exist and contain indexed documents
(e.g., "Enter the **Collection Name**. Note: for retrieval, the collection must
already exist and be populated with indexed documents; an empty or non-existent
collection will return no results."). Update the text in the
en/docs/ai/rag-retrieval.md step 3 accordingly.

In `@en/docs/ai/schedule-rag-automation.md`:
- Around line 29-43: The docs currently say both Google Drive and Amazon S3 are
supported but only show Google Drive steps; add a parallel "Amazon S3"
subsection under the datasource selection that lists the required fields (Bucket
name/URL, Region, Access Key ID, Secret Access Key, optional Endpoint for
S3-compatible providers, and any IAM role or public/private access notes),
provide brief guidance on where to find the bucket name and region, and either
include example steps for entering these values in the UI or add a cross-link to
existing S3 configuration documentation if one exists (mirror the structure used
in the Google Drive steps and include an info block for important notes such as
bucket permissions).

---

Duplicate comments:
In `@en/docs/ai/rag-retrieval.md`:
- Around line 45-46: The admonition is written as '!!!info' which won’t render;
change the token to include a space — use '!!! info' (and keep the same
indentation level) so Python-Markdown's admonition extension recognizes it and
renders the reranking note as a styled info block.
- Around line 6-7: Fix the grammar in the note sentence inside the RAG retrieval
guide: change the line that currently reads "If you haven't already follow the
[Ingestion](rag-ingestion.md) guide on how you can do that." to a corrected,
concise phrasing such as "If you haven't already, follow the
[Ingestion](rag-ingestion.md) guide to ingest files into your vector store."
Ensure the comma after "already" is added and the trailing informal clause is
replaced with the clearer instruction; update the note block where this sentence
appears.

In `@en/docs/ai/rag-service.md`:
- Around line 152-153: Fix the ungrammatical sentence in the doc: replace the
line that currently reads "For more details on RAG ingestion and retrieval, and
how to obtain API keys and credentials refer the [RAG
Ingestion](rag-ingestion.md) guide." with a corrected version that inserts the
missing comma and corrects "refer the" to "refer to the", e.g. "For more details
on RAG ingestion and retrieval, and how to obtain API keys and credentials,
refer to the [RAG Ingestion](rag-ingestion.md) guide." Ensure you update the
exact sentence text in the en/docs/ai/rag-service.md content.

In `@en/docs/ai/schedule-rag-automation.md`:
- Around line 35-38: Update the guidance in the "schedule-rag-automation.md" doc
so the public-folder limitation is explicit at the top of Step 2 rather than
only inside the collapsible info block: move or duplicate the sentence "The
target folder must be public ('Anyone with the link'), as API keys cannot access
private files." to the start of Step 2 and add a short sentence naming supported
alternatives (OAuth 2.0 or service account credentials) for users who need
private-folder access or note that private access is not supported if that is
intentional; modify the existing info block (the "Info" block shown) to
reference the top-level note and/or provide the OAuth/service-account options.

---

Nitpick comments:
In `@en/docs/ai/rag-service.md`:
- Around line 27-30: Make the UI label "Try it out" bold to match the existing
convention used for other interactive labels; locate the step text containing
the plain "Try it out" in the steps list (the same block that contains
"**Test**" and "**Execute**") and change it so the label is wrapped in markdown
bold markers (i.e., make it "**Try it out**") to keep formatting consistent with
the other UI elements.

en/docs/ai/rag-retrieval.md

coderabbitai · 2026-02-18T12:23:13Z

en/docs/ai/schedule-rag-automation.md

+The datasource specifies the location from which files will be ingested. Devant supports both Google Drive folders and Amazon S3 buckets as datasources.
+
+1. Select `Google Drive` as the datasource.
+
+2. Enter the **API Key** in the API key field.
+
+    ???+ info "Info"
+        To obtain a key, use the [Google Cloud Console](https://console.cloud.google.com/) to create a project and generate a key as described in the [Google Documentation](https://cloud.google.com/docs/authentication/api-keys#create) and restrict it to the **Google Drive API**.
+
+        **Note:** The target folder must be **public** ("Anyone with the link"), as API keys cannot access private files.
+
+3. Provide the **Folder ID** of the Google Drive folder containing the files to be ingested.
+
+    ???+ info "Info"
+        The Google Drive folder ID can be found in the URL when viewing the folder in Google Drive. It is the string that appears after `/folders/` in the URL.
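
As a small illustration of the Folder ID rule quoted above, the sketch below pulls the segment after `/folders/` out of a folder URL. The function name, the example URL, and the assumed ID character set (letters, digits, `_`, `-`) are illustrative assumptions, not part of the Devant documentation.

```python
import re
from typing import Optional

# Assumed ID alphabet: letters, digits, underscore, and hyphen.
_FOLDER_ID = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def drive_folder_id(url: str) -> Optional[str]:
    """Return the segment after /folders/ in a Google Drive folder URL, or None."""
    match = _FOLDER_ID.search(url)
    return match.group(1) if match else None


# Made-up folder URL; query parameters such as ?usp=sharing are ignored.
print(drive_folder_id("https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing"))
```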


⚠️ Potential issue | 🟡 Minor

S3 datasource is mentioned as supported but its configuration steps are not documented.

Line 29 explicitly states both Google Drive and Amazon S3 are supported, yet the guide only walks through Google Drive. Users choosing Amazon S3 as the datasource have no instructions for providing the bucket URL, region, access key, or secret key.

Either document the S3 configuration in a sub-section (parallel to the Google Drive sub-section), or note that S3 documentation is covered elsewhere and link to it.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@en/docs/ai/schedule-rag-automation.md` around lines 29 - 43, The docs currently say both Google Drive and Amazon S3 are supported but only show Google Drive steps; add a parallel "Amazon S3" subsection under the datasource selection that lists the required fields (Bucket name/URL, Region, Access Key ID, Secret Access Key, optional Endpoint for S3-compatible providers, and any IAM role or public/private access notes), provide brief guidance on where to find the bucket name and region, and either include example steps for entering these values in the UI or add a cross-link to existing S3 configuration documentation if one exists (mirror the structure used in the Google Drive steps and include an info block for important notes such as bucket permissions).

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@en/docs/ai/rag-retrieval.md`:
- Line 19: Fix the typo in the sentence "Enter the **Collection Name** from
which you want to retrive data." by changing "retrive" to "retrieve" so the line
reads "Enter the **Collection Name** from which you want to retrieve data.";
update that exact string in en/docs/ai/rag-retrieval.md.
- Line 24: Replace the incorrect brand spacing in the sentence "Select
`text-embedding-ada-002` embedding model from the **Open AI** dropdown." by
changing "**Open AI**" to "**OpenAI**" so the brand name is a single word
(consistent with the other occurrence on line 28); update the phrase in the same
string or markdown element where that sentence appears (search for
"text-embedding-ada-002" to locate it).

---

Duplicate comments:
In `@en/docs/ai/rag-retrieval.md`:
- Line 7: Update the sentence "This guide assumes you have already ingested
files into your vector store. If you haven't already follow the
[Ingestion](rag-ingestion.md) guide on how you can do that." by inserting a
comma after "already" so it reads "If you haven't already, follow the
[Ingestion](rag-ingestion.md) guide..." — locate the sentence in
en/docs/ai/rag-retrieval.md and apply the punctuation fix to the second
sentence.

en/docs/ai/rag-retrieval.md

Bhashinee · 2026-02-19T08:13:26Z

en/docs/ai/rag-ingestion.md

+ ![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion1.gif)
+
+### Step 4: Choose ingestion mode
+


Suggested change

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@en/docs/ai/rag-retrieval.md`:
- Around line 45-46: The admonition block starting with '???+ info "Info"' has
its list item line indented with 5 spaces, which can render as a code block;
change the indentation of the line beginning with '-' to exactly 4 spaces so the
list item is parsed correctly (locate the '???+ info "Info"' block and the
subsequent line that currently starts with five spaces and reduce it to four).
- Line 52: The external link labeled "RAG retrieval"
(https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval)
in en/docs/ai/rag-retrieval.md is returning 403 and must be removed or replaced;
update the sentence that references the RAG retrieval tutorial to either (a)
point to a working public URL, (b) embed the essential implementation
steps/configuration directly into this doc under a new "RAG retrieval"
subsection, or (c) remove the cross-reference entirely and adjust surrounding
text accordingly so there are no broken links or references to the inaccessible
WSO2 Integrator: BI documentation.

In `@en/docs/ai/rag-service.md`:
- Line 53: The docs use two different names for the same parameter — "Max
segment size" in the /upload section and "Max chunk size" in the /chunks
section; standardize on a single term (use "Max chunk size") by replacing all
occurrences of "Max segment size" in the /upload endpoint text with "Max chunk
size" and ensure any related descriptions, examples, and parameter headings in
both the /upload and /chunks sections reference the same exact phrase so the
parameter name is consistent across the documentation.
- Around line 18-28: Fix minor prose issues in the Create/Test service steps:
change "Click the **Create Service** button and your service will get created "
to "Click the **Create Service** button and your service will be created."
(remove trailing space and add period), add a comma in "Once the component is
created you will be redirected to the Overview page." so it reads "Once the
component is created, you will be redirected to the Overview page.", and
bold-format the UI element "Try it out" so it appears as **Try it out** (to
match the already bolded **Execute**).

In `@en/docs/ai/schedule-rag-automation.md`:
- Around line 10-11: Update the prerequisite reference in
schedule-rag-automation.md so it correctly matches the current rag-ingestion.md:
either change the text that currently says "steps 1 to 4" to "steps 1 to 6" or
remove the specific step count and refer generically to "the RAG ingestion
guide" (e.g., "This guide assumes you have completed the steps in the RAG
ingestion guide.") to avoid mismatches; locate and edit the line that mentions
rag-ingestion.md to apply the chosen wording.

---

Duplicate comments:
In `@en/docs/ai/rag-retrieval.md`:
- Line 7: The sentence in the documentation line "If you haven't already follow
the [Ingestion](rag-ingestion.md) guide on how you can do that." is missing a
comma; update the sentence in en/docs/ai/rag-retrieval.md to read "If you
haven't already, follow the [Ingestion](rag-ingestion.md) guide on how you can
do that." so it is grammatically correct.

In `@en/docs/ai/rag-service.md`:
- Line 151: Update the sentence fragment "For more details on RAG ingestion and
retrieval, and how to obtain API keys and credentials refer the [RAG
Ingestion](rag-ingestion.md) guide." to read "For more details on RAG ingestion
and retrieval, and how to obtain API keys and credentials, refer to the [RAG
Ingestion](rag-ingestion.md) guide." — locate the exact string in rag-service.md
and insert the comma and the word "to" as shown.

In `@en/docs/ai/schedule-rag-automation.md`:
- Around line 29-43: Add a parallel Amazon S3 subsection under the "datasource"
section to document S3-specific fields and steps: specify required inputs
(Bucket name, Region, Access Key ID, Secret Access Key, optional Session Token,
and optional Prefix/path), describe how to find the bucket name and region,
explain required IAM permissions or public access settings, and include an
example or note about setting object ACLs or making objects accessible;
alternatively add a clear cross-link from the Google Drive steps to a new or
existing "S3 datasource configuration" doc and ensure the terms "Amazon S3",
"S3", and "datasource" are mentioned so users selecting S3 can find the
configuration guidance.

coderabbitai · 2026-02-19T08:49:05Z

en/docs/ai/rag-retrieval.md

+???+ info "Info"
+     - Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.


⚠️ Potential issue | 🟡 Minor

???+ block content indented with 5 spaces instead of 4 — potential rendering issue.

Python-Markdown's admonition extension expects exactly 4 spaces of indentation for block content. The current 5-space indent (`     -`) could cause the line to be parsed as a code block rather than a list item within the collapsible note.

✏️ Proposed fix

???+ info "Info"
-     - Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.
+    - Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@en/docs/ai/rag-retrieval.md` around lines 45 - 46, The admonition block starting with '???+ info "Info"' has its list item line indented with 5 spaces, which can render as a code block; change the indentation of the line beginning with '-' to exactly 4 spaces so the list item is parsed correctly (locate the '???+ info "Info"' block and the subsequent line that currently starts with five spaces and reduce it to four).
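
The two admonition conventions raised in this thread (a space after the `!!!` or `???+` marker, and body content indented by 4 spaces) could be checked mechanically. The following is a rough sketch of such a check, not an existing tool: the function name and message wording are assumptions, and it only flags style deviations without making any claim about how a given Markdown renderer treats them.

```python
import re

# Marker immediately followed by the type name, with no space in between.
_MISSING_SPACE = re.compile(r"^\s*(!!!|\?\?\?\+?)(?=\w)")
# Any admonition header: optional indent, marker, optional space, type name.
_ADMONITION = re.compile(r"^(\s*)(!!!|\?\?\?\+?)\s*\w+")


def lint_admonitions(markdown: str) -> list[str]:
    """Flag markers without a space and body lines whose indent is not
    4 spaces (or a further multiple of 4) deeper than the marker."""
    problems = []
    body_indent = None  # expected indent of the current admonition body, if any
    for number, line in enumerate(markdown.splitlines(), start=1):
        header = _ADMONITION.match(line)
        if header:
            if _MISSING_SPACE.match(line):
                problems.append(f"line {number}: add a space after '{header.group(2)}'")
            body_indent = len(header.group(1)) + 4
            continue
        if body_indent is None or not line.strip():
            continue  # outside an admonition, or a blank line
        indent = len(line) - len(line.lstrip(" "))
        if indent < body_indent:
            body_indent = None  # a dedent ends the admonition body
        elif (indent - body_indent) % 4 != 0:
            problems.append(f"line {number}: indent is {indent} spaces, expected {body_indent}")
    return problems


sample = '!!!note\n    Fine body line.\n\n???+ info "Info"\n     - Five-space list item.\n'
for problem in lint_admonitions(sample):
    print(problem)
```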

coderabbitai · 2026-02-19T08:49:05Z

en/docs/ai/rag-retrieval.md

+
+After completing the RAG ingestion process, you can also implement a RAG retrieval to connect your vector database with user queries and generate responses using the WSO2 Integrator: BI.
+
+For detailed implementation steps and configuration, refer to the [RAG retrieval](https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval) tutorial in the WSO2 Integrator: BI documentation.


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the HTTP status of the WSO2 BI RAG tutorial link
curl -o /dev/null -s -w "%{http_code}\n" \
  "https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval"

Repository: wso2/docs-devant

Length of output: 63

Remove or replace the inaccessible external link. The referenced WSO2 Integrator: BI documentation link returns a 403 Forbidden status, making it inaccessible to users. Either provide a working link, embed the relevant information directly in the documentation, or remove the reference if the external resource is no longer available.
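
For anyone re-verifying the link outside a shell, the same status check can be sketched with the Python standard library. The function name and the injectable `opener` parameter are assumptions made here so the check can be exercised without network access. Note that a 403 returned to an automated client may reflect bot filtering, so opening the link in a browser is a useful second check.

```python
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> int:
    """Return the HTTP status code for url, including 4xx/5xx codes."""
    # The fragment (for example #rag-retrieval) is never sent to the server, so drop it.
    request = urllib.request.Request(url.split("#", 1)[0])
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses; the code is carried on the exception.
        return error.code
```

Connection failures (`urllib.error.URLError`) are deliberately not caught, so a DNS or TLS problem is not mistaken for an HTTP status.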

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@en/docs/ai/rag-retrieval.md` at line 52, The external link labeled "RAG retrieval" (https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval) in en/docs/ai/rag-retrieval.md is returning 403 and must be removed or replaced; update the sentence that references the RAG retrieval tutorial to either (a) point to a working public URL, (b) embed the essential implementation steps/configuration directly into this doc under a new "RAG retrieval" subsection, or (c) remove the cross-reference entirely and adjust surrounding text accordingly so there are no broken links or references to the inaccessible WSO2 Integrator: BI documentation.

coderabbitai · 2026-02-19T08:49:05Z

en/docs/ai/rag-service.md

+Click the **Create Service** button and your service will get created 
+
+!!! note
+    - When the service is created, Devant automatically increases the container resources (CPU and memory) to ensure reliable operation.
+
+### Step 2: Test endpoints
+Once the component is created you will be redirected to the Overview page.
+
+1. On the development environment card, click **Test** to open the OpenAPI Console, where you will be able to try out all the available endpoints from the endpoint list.
+2. Expand the resource you want to test.
+3. Click Try it out to enable it.


⚠️ Potential issue | 🟡 Minor

Several minor prose issues in the Create/Test service steps.

Line 18: missing period and trailing space; "will get created" → "will be created."

Line 24: missing comma — "Once the component is created**,** you will be redirected…"

Line 28: Try it out is a UI element and should be bold-formatted, consistent with **Execute** on line 30.

✏️ Proposed fix

-Click the **Create Service** button and your service will get created 
+Click the **Create Service** button and your service will be created.

-Once the component is created you will be redirected to the Overview page.
+Once the component is created, you will be redirected to the Overview page.

-3. Click Try it out to enable it.
+3. Click **Try it out** to enable it.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@en/docs/ai/rag-service.md` around lines 18 - 28, Fix minor prose issues in the Create/Test service steps: change "Click the **Create Service** button and your service will get created " to "Click the **Create Service** button and your service will be created." (remove trailing space and add period), add a comma in "Once the component is created you will be redirected to the Overview page." so it reads "Once the component is created, you will be redirected to the Overview page.", and bold-format the UI element "Try it out" so it appears as **Try it out** (to match the already bolded **Execute**).

coderabbitai · 2026-02-19T08:49:06Z

en/docs/ai/rag-service.md

+- Collection name: where data will be stored.
+- Embedding model provider and model (e.g., OpenAI, Azure, Mistral) and API key.
+- Chunking strategy: `recursive`, `sentence`, or `character`.
+- Max segment size and max overlap size for chunking.


⚠️ Potential issue | 🟡 Minor

Terminology inconsistency between /upload and /chunks: "Max segment size" vs "Max chunk size".

Line 53 (/upload) calls the parameter "Max segment size", while line 118 (/chunks) calls the same parameter "Max chunk size". Using different names for the same field across two closely related endpoints will confuse users who reference both sections while implementing API calls.

Pick one term (e.g., "Max chunk size") and use it consistently across both endpoints.

✏️ Proposed fix

-- Max segment size and max overlap size for chunking.
+- Max chunk size and max overlap size for chunking.

📝 Committable suggestion

Suggested change

-- Max segment size and max overlap size for chunking.
+- Max chunk size and max overlap size for chunking.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@en/docs/ai/rag-service.md` at line 53, The docs use two different names for the same parameter — "Max segment size" in the /upload section and "Max chunk size" in the /chunks section; standardize on a single term (use "Max chunk size") by replacing all occurrences of "Max segment size" in the /upload endpoint text with "Max chunk size" and ensure any related descriptions, examples, and parameter headings in both the /upload and /chunks sections reference the same exact phrase so the parameter name is consistent across the documentation.
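To make the naming point concrete, here is a minimal sketch of how a maximum chunk size and a maximum overlap size usually interact in a character-based chunker. It is an illustration only, not Devant's implementation: the function name `chunk_text` and the parameter names `max_chunk_size` and `max_overlap_size` are hypothetical and are not taken from the RAG service API.

```python
def chunk_text(text: str, max_chunk_size: int, max_overlap_size: int) -> list[str]:
    """Split text into fixed-size character chunks that overlap their neighbours.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= max_overlap_size < max_chunk_size:
        raise ValueError("max_overlap_size must be in [0, max_chunk_size)")

    # Each new chunk starts this many characters after the previous one,
    # so consecutive chunks share max_overlap_size characters.
    step = max_chunk_size - max_overlap_size

    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + max_chunk_size >= len(text):
            break
    return chunks
```

Because both `/upload` and `/chunks` are described as taking the same pair of values, a single name for the size limit lets a reader map both endpoints onto one mental model like the one above.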

coderabbitai · 2026-02-19T08:49:06Z

en/docs/ai/schedule-rag-automation.md

+    - Schedule RAG ingestion option is available only for paid Devant users.
+    - This guide assumes you have completed steps 1 to 4 in the [RAG ingestion](rag-ingestion.md) guide.


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and display all step headings in rag-ingestion.md
fd -e md -g "rag-ingestion.md" --exec cat -n {} | grep -E "^[^:]*:\s*#{2,4} (Step|step)"

Repository: wso2/docs-devant

Length of output: 42

🏁 Script executed:

#!/bin/bash
# Find rag-ingestion.md in the repository
git ls-files | grep -i "rag-ingestion"

Repository: wso2/docs-devant

Length of output: 258

🏁 Script executed:

#!/bin/bash
# Read rag-ingestion.md and find all step headings
cat -n en/docs/ai/rag-ingestion.md | grep -E "#{2,4} Step"

Repository: wso2/docs-devant

Length of output: 295

🏁 Script executed:

#!/bin/bash
# View the exact content around line 11 in schedule-rag-automation.md to confirm the reference
head -15 en/docs/ai/schedule-rag-automation.md | tail -8

Repository: wso2/docs-devant

Length of output: 369

Update the prerequisite step reference to match the current RAG ingestion guide.

Line 11 references "steps 1 to 4" in the RAG ingestion guide, but rag-ingestion.md now contains 6 steps (Step 1–6: Initialize vector store, Configure embedding model, Configure chunking, Choose ingestion mode, Upload source files, and Verify). Either update the reference to "steps 1 to 6" or remove the specific step count if only a subset is actually required as a prerequisite.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@en/docs/ai/schedule-rag-automation.md` around lines 10 - 11, Update the prerequisite reference in schedule-rag-automation.md so it correctly matches the current rag-ingestion.md: either change the text that currently says "steps 1 to 4" to "steps 1 to 6" or remove the specific step count and refer generically to "the RAG ingestion guide" (e.g., "This guide assumes you have completed the steps in the RAG ingestion guide.") to avoid mismatches; locate and edit the line that mentions rag-ingestion.md to apply the chosen wording.

SashenkaG added 5 commits February 16, 2026 09:47

Merge upstream/main

cbcdf02

Merge remote-tracking branch 'upstream/main'

eb3e6b3

Merge remote-tracking branch 'upstream/main'

985dc35

Add RAG ingestion, retrieval service and scheduled ingestion automation documentation

50d9d6d

Refactor RAG section in navigation and update link

c82260f

SashenkaG requested a review from pcnfernando as a code owner February 18, 2026 09:48

coderabbitai bot reviewed Feb 18, 2026

Resolve CodeRabbit reviews

99dc05e

coderabbitai bot reviewed Feb 18, 2026

Clarify instructions in RAG retrieval docs

f2abae4

coderabbitai bot reviewed Feb 18, 2026

en/docs/ai/rag-retrieval.md (outdated, resolved)

en/docs/ai/rag-retrieval.md (outdated, resolved)

Fix typos

b641d82

Bhashinee reviewed Feb 19, 2026

Remove unnecessary blank lines

87f5956

coderabbitai bot reviewed Feb 19, 2026

		![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion1.gif)

		### Step 4: Choose ingestion mode

		???+ info "Info"
		- Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.


		After completing the RAG ingestion process, you can also implement a RAG retrieval to connect your vector database with user queries and generate responses using the WSO2 Integrator: BI.

		For detailed implementation steps and configuration, refer to the [RAG retrieval](https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval) tutorial in the WSO2 Integrator: BI documentation.


❌ Failed checks (1 inconclusive)


2 participants
